Probabilistic Context-Free Grammars for Phonology

نویسنده

  • Karin Müller
چکیده

We present a phonological probabilistic contextfree grammar, which describes the word and syllable structure of German words. The grammar is trained on a large corpus by a simple supervised method, and evaluated on a syllabification task achieving 96.88% word accuracy on word tokens, and 90.33% on word types. We added rules for English phonemes to the grammar, and trained the enriched grammar on an English corpus. Both grammars are evaluated qualitatively showing that probabilistic context-free grammars can contribute linguistic knowledge to phonology. Our formal approach is multilingual, while the training data is language-dependent.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...

متن کامل

Estimation of Consistent Probabilistic Context-free Grammars

We consider several empirical estimators for probabilistic context-free grammars, and show that the estimated grammars have the so-called consistency property, under the most general conditions. Our estimators include the widely applied expectation maximization method, used to estimate probabilistic context-free grammars on the basis of unannotated corpora. This solves a problem left open in th...

متن کامل

Probabilistic Context-Free Grammars for Syllabification and Grapheme-to-Phoneme Conversion

We investigated the applicability of probabilistic context-free grammars to syllabi cation and grapheme-to-phoneme conversion. The results show that the standard probability model of context-free grammars performs very well in predicting syllable boundaries. However, our results indicate that the standard probability model does not solve grapheme-to-phoneme conversion su ciently although, we va...

متن کامل

The Generative Power of Probabilistic and Weighted Context-Free Grammars

Over the last decade, probabilistic parsing has become the standard in the parsing literature where one of the purposes of those probabilities is to discard unlikely parses. We investigate the effect that discarding low probability parses has on both the weak and strong generative power of context-free grammars. We prove that probabilistic context-free grammars are more powerful than their non-...

متن کامل

Grammatical Inference of some Probabilistic Context-Free Grammars from Positive Data using Minimum Satisfiability

Recently, different theoretical learning results have been found for a variety of context-free grammar subclasses through the use of distributional learning (Clark, 2010b). However, these results are still not extended to probabilistic grammars. In this work, we give a practical algorithm, with some proven properties, that learns a subclass of probabilistic grammars from positive data. A minimu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002